- Progress in machine learning and artificial intelligence promises to advance research and understanding across a wide range of fields and activities. In tandem, increased awareness of the importance of open data for reproducibility and scientific transparency is making inroads in fields that have not traditionally produced large publicly available datasets. Data sharing requirements from publishers and funders, as well as from other stakeholders, have also created pressure to make datasets with research and/or public interest value available through digital repositories. However, to make the best use of existing data, and to facilitate the creation of useful future datasets, robust, interoperable and usable standards need to evolve and adapt over time. The open-source development model offers significant potential benefits to the process of standard creation and adaptation. In particular, data and metadata standards can use long-standing technical and socio-technical processes that have been key to managing the development of software, and that allow broad community input to be incorporated into the formulation of these standards. On the other hand, open-source models carry unique risks that need to be considered. This report surveys existing open-source standards development, addressing these benefits and risks. It outlines recommendations for standards developers, funders and other stakeholders on the path to robust, interoperable and usable open-source data and metadata standards.
- The Human Connectome Project (HCP) has become a keystone dataset in human neuroscience, with a plethora of important applications in advancing brain imaging methods and our understanding of the human brain. We focused on tractometry of HCP diffusion-weighted MRI (dMRI) data. We used an open-source software library (pyAFQ; https://yeatmanlab.github.io/pyAFQ) to perform probabilistic tractography and delineate the major white matter pathways in the HCP subjects with a complete dMRI acquisition (n = 1,041). We used diffusion kurtosis imaging (DKI) to model white matter microstructure in each voxel of the white matter, and extracted tract profiles of DKI-derived tissue properties along the length of the tracts. We explored the empirical properties of the data: first, we assessed the heritability of DKI tissue properties using the known genetic linkage of the large number of twin pairs sampled in HCP. Second, we tested the ability of tractometry to serve as the basis for predictive models of individual characteristics (e.g., age, crystallized/fluid intelligence, reading ability), compared to local connectome features. To facilitate exploration of the dataset, we created a new web-based visualization tool and used it to visualize the data in the HCP tractometry dataset. Finally, we used the HCP dataset as a test bed for a new technological innovation: the TRX file format for representing dMRI-based streamlines. We released the processing outputs and tract profiles as a publicly available data resource through the AWS Open Data program's Open Neurodata repository. We found heritability as high as 0.9 for DKI-based metrics in some brain pathways. We also found that tractometry extracts as much useful information about individual differences as the local connectome method. We released a new web-based visualization tool for tractometry, "Tractoscope" (https://nrdg.github.io/tractoscope). We found that TRX files require considerably less disk space, a crucial attribute for large datasets like HCP. In addition, TRX incorporates a specification for grouping streamlines, further simplifying tractometry analysis.
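The central tract-profile computation described above can be sketched in a few lines of NumPy: resample each streamline to a fixed number of nodes and average a scalar map (e.g., a DKI-derived tissue property) at each node. This is a minimal illustration under simplifying assumptions (nearest-neighbor sampling, no weighting of streamlines by distance from the bundle core), not pyAFQ's actual implementation; all names are hypothetical.

```python
import numpy as np

def tract_profile(streamlines, scalar_vol, affine, n_nodes=100):
    """Mean of a scalar map at each of n_nodes positions along a bundle.

    streamlines : list of (k_i, 3) arrays of points in world (mm) coordinates
    scalar_vol  : 3D array holding the tissue property (e.g., FA)
    affine      : (4, 4) voxel-to-world affine of scalar_vol
    """
    inv = np.linalg.inv(affine)
    profile = np.zeros(n_nodes)
    for sl in streamlines:
        # Resample to n_nodes equally spaced points along the arc length.
        d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(sl, axis=0), axis=1))]
        t = np.linspace(0, d[-1], n_nodes)
        nodes = np.column_stack([np.interp(t, d, sl[:, i]) for i in range(3)])
        # Map world coordinates to voxel indices (nearest neighbor).
        ijk = (inv @ np.c_[nodes, np.ones(n_nodes)].T)[:3].round().astype(int).T
        profile += scalar_vol[ijk[:, 0], ijk[:, 1], ijk[:, 2]]
    return profile / len(streamlines)
```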
- Stem‐mapped forest stands offer important opportunities for investigating the fine‐scale spatial processes occurring in forest ecosystems. These stands are areas of the forest where the precise locations and repeated size measurements of each tree are recorded, enabling the calculation of spatially‐explicit metrics of individual growth rates and of the entire tree community. The most common use of these datasets is to investigate the drivers of variation in forest processes by modeling tree growth rate or mortality as a function of these neighborhood metrics. However, neighborhood metrics could also serve as important covariates of many other spatially variable forest processes, including seedling recruitment, herbivory and soil microbial community composition. Widespread use of stem‐mapped forest stand datasets is currently hampered by the lack of standardized, efficient and easy‐to‐use tools to calculate tree dynamics (e.g. growth, mortality) and the neighborhood metrics that impact them. We present the forestexplorR package, which facilitates the munging, exploration, visualization and analysis of stem‐mapped forest stands. By providing flexible, user‐friendly functions that calculate neighborhood metrics and implement a recently‐developed rapid‐fitting tree growth and mortality model, forestexplorR broadens the accessibility of stem‐mapped forest stand data. We demonstrate the functionality of forestexplorR by using it to investigate how the species identity of neighboring trees influences the growth rates of three common tree species in Mt Rainier National Park, WA, USA. forestexplorR is designed to help researchers incorporate spatially‐explicit descriptions of tree communities into their studies, and we expect this broader community of contributors to develop exciting new ways of using stem‐mapped forest stand data.
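forestexplorR is an R package; purely to illustrate the kind of neighborhood metric that stem-mapped data support, here is a minimal Python sketch of the classic Hegyi competition index (a standard distance-weighted size ratio). The function name and conventions are hypothetical and do not reflect forestexplorR's API.

```python
import numpy as np

def hegyi_index(focal_xy, focal_dbh, nbr_xy, nbr_dbh, radius=10.0):
    """Hegyi competition index for one focal tree: the sum, over all
    neighbors within `radius` meters, of (neighbor DBH / focal DBH)
    divided by the distance to that neighbor.

    focal_xy : (2,) stem-map coordinates of the focal tree (m)
    nbr_xy   : (n, 2) coordinates of candidate neighbors (m)
    """
    dist = np.linalg.norm(nbr_xy - focal_xy, axis=1)
    in_zone = (dist > 0) & (dist <= radius)  # exclude the focal tree itself
    return np.sum((nbr_dbh[in_zone] / focal_dbh) / dist[in_zone])
```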
- Background: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates, but efficient and appropriate selection of α can be challenging, and it becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and the correlations across predictors, it is also not straightforwardly interpretable. Results: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and to compare across models and datasets. Conclusion: Fractional ridge regression has several benefits: the solutions obtained for different values of γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for the analysis of large, complex datasets.
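The reparameterization can be sketched directly from the SVD of the design matrix: the norm of the ridge solution shrinks monotonically as α grows, so the α that yields a target fraction γ of the OLS norm can be recovered by interpolation over a grid. This is a minimal sketch of the idea, not the fracridge package's (far more efficient) algorithm.

```python
import numpy as np

def frac_ridge(X, y, frac):
    """Ridge solution whose L2-norm is `frac` (= gamma) times the norm
    of the unregularized OLS solution."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    uty = U.T @ y
    ols = Vt.T @ (uty / s)                 # alpha = 0 (OLS) solution
    target = frac * np.linalg.norm(ols)

    # ||beta(alpha)|| decreases monotonically from ||ols|| toward 0,
    # so a log-spaced grid plus interpolation locates the matching alpha.
    alphas = np.logspace(-6, 10, 200) * s.max() ** 2
    norms = np.array(
        [np.linalg.norm(Vt.T @ (s * uty / (s**2 + a))) for a in alphas]
    )
    a_star = np.interp(target, norms[::-1], alphas[::-1])
    return Vt.T @ (s * uty / (s**2 + a_star))
```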
- Neighborhood models have allowed us to test many hypotheses regarding the drivers of variation in tree growth, but they require considerable computation due to the many empirically supported non-linear relationships they include. Regularized regression represents a far more efficient neighborhood modeling method, but it is unclear whether such an ecologically unrealistic model can provide accurate insights into tree growth. Rapid computation is becoming increasingly important as ecological datasets grow in size, and may be essential when using neighborhood models to predict tree growth beyond sample plots or into the future. We built a novel regularized regression model of tree growth and investigated whether it reached the same conclusions as a commonly used neighborhood model regarding hypotheses of how tree growth is influenced by the species identity of neighboring trees. We also evaluated the ability of both models to interpolate the growth of trees not included in the model-fitting dataset. Our regularized regression model replicated most of the classical model's inferences in a fraction of the time, without using high-performance computing resources. We found that both methods could interpolate out-of-sample tree growth, but the method making the most accurate predictions varied among focal species. Regularized regression is particularly efficient for comparing hypotheses because it automates the process of model selection and can handle correlated explanatory variables. This feature means that regularized regression could also be used to select among potential explanatory variables (e.g., climate variables) and thereby streamline the development of a classical neighborhood model. Both regularized regression and classical methods can interpolate out-of-sample tree growth, but future research must determine whether predictions can be extrapolated to trees experiencing novel conditions. Overall, we conclude that regularized regression methods can complement classical methods in the investigation of tree growth drivers and represent a valuable tool for advancing this field toward prediction.
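To illustrate why regularized regression automates model selection in the presence of correlated covariates, here is a minimal scikit-learn sketch with synthetic data standing in for per-tree neighborhood covariates; it is not the authors' actual model, and L1 (lasso) regularization is shown as one representative choice.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: columns of X would be neighborhood covariates
# (e.g., summed sizes of neighbors of each species); y is log growth.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Cross-validation picks the penalty strength; uninformative or
# redundant covariates get coefficients shrunk exactly to zero,
# which is the automated model selection referred to above.
model = make_pipeline(StandardScaler(), LassoCV(cv=5)).fit(X, y)
print(model[-1].coef_)
```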
- Dimitriadis, Stavros I (Ed.) The analysis of brain-imaging data requires complex processing pipelines to support findings on brain function or pathologies. Recent work has shown that variability in analytical decisions, small amounts of noise, or differences in computational environments can lead to substantial differences in results, endangering trust in conclusions. We explored the instability of results by instrumenting a structural connectome estimation pipeline with Monte Carlo Arithmetic to introduce random noise throughout. We evaluated the reliability of the connectomes, the robustness of their features, and the eventual impact on analysis. The stability of results was found to range from perfectly stable (i.e., all digits of the data significant) to highly unstable (i.e., 0–1 significant digits). This paper highlights the potential of leveraging induced variance in estimates of brain connectivity to reduce bias in networks without compromising reliability, while increasing the robustness, and the potential upper bound, of their applications in the classification of individual differences. We demonstrate that stability evaluations are necessary for understanding the error inherent in brain imaging experiments, and show how numerical analysis can be applied to typical analytical workflows in brain imaging and other domains of computational science, as the techniques used are data and context agnostic and globally relevant. Overall, while the extreme variability in results due to analytical instabilities could severely hamper our understanding of brain organization, it also affords us an opportunity to increase the robustness of findings.
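The flavor of the approach can be sketched as follows: inject random relative noise at a chosen virtual precision, rerun the pipeline many times, and estimate the number of significant digits from the spread of the outputs. Real MCA tools (e.g., Verificarlo) instrument every floating-point operation; perturbing values directly, as below, is only a coarse stand-in, and both functions are illustrative.

```python
import numpy as np

def mc_perturb(x, t=24, rng=None):
    """Add random relative noise of magnitude ~2**-t to x, mimicking
    the 'virtual precision' of Monte Carlo Arithmetic."""
    rng = rng or np.random.default_rng()
    return x * (1 + rng.uniform(-1, 1, np.shape(x)) * 2.0**-t)

def significant_digits(samples):
    """Estimate significant decimal digits across repeated noisy runs
    as -log10(std / |mean|), capped near float64's ~15 digits."""
    m, sd = np.mean(samples), np.std(samples)
    return 15.0 if sd == 0 else min(15.0, -np.log10(sd / abs(m)))
```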
- Tractography has created new horizons for researchers to study brain connectivity in vivo. However, tractography is an advanced and challenging method that has not yet been used for large-scale medical data analysis to the same extent as more traditional brain imaging methods. This work allows tractography to be used for large-scale, high-quality medical analytics. BUndle ANalytics (BUAN) is a fast, robust, and flexible computational framework for real-world tractometric studies. BUAN combines tractography and anatomical information to analyze challenging datasets, and it identifies significant group differences in specific locations of the white matter bundles. Additionally, BUAN takes the shape of the bundles into consideration in the analysis: it compares the shapes of bundles using a metric called bundle adjacency, which calculates the shape similarity between two given bundles, and builds networks of bundle shape similarities that can be paramount for automating quality control. BUAN is freely available in DIPY. Results are presented using publicly available Parkinson's Progression Markers Initiative data.
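One plausible reading of the bundle-adjacency idea, sketched with NumPy: measure, symmetrically, what fraction of each bundle's streamlines lie within a distance threshold of the other bundle, using the standard mean-direct-flip streamline distance. The exact definition is given in the BUAN paper and DIPY's implementation; this sketch assumes streamlines pre-resampled to a common number of points.

```python
import numpy as np

def mdf(s1, s2):
    """Mean direct-flip distance between two streamlines that have been
    resampled to the same number of points."""
    direct = np.mean(np.linalg.norm(s1 - s2, axis=1))
    flipped = np.mean(np.linalg.norm(s1 - s2[::-1], axis=1))
    return min(direct, flipped)

def bundle_adjacency(bundle_a, bundle_b, threshold=5.0):
    """Symmetric shape similarity: average, over both directions, of the
    fraction of streamlines within `threshold` mm of the other bundle."""
    def coverage(src, dst):
        return np.mean([min(mdf(s, t) for t in dst) < threshold for s in src])
    return 0.5 * (coverage(bundle_a, bundle_b) + coverage(bundle_b, bundle_a))
```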
- The neural pathways that carry information from the foveal, macular, and peripheral visual fields have distinct biological properties. The optic radiations (OR) carry foveal and peripheral information from the thalamus to the primary visual cortex (V1) through adjacent but separate pathways in the white matter. Here, we perform white matter tractometry using pyAFQ on a large sample of diffusion MRI (dMRI) data from subjects with healthy vision in the U.K. Biobank dataset (UKBB; N = 5,382; age 45–81). We use pyAFQ to characterize white matter tissue properties in parts of the OR that transmit information about the foveal, macular, and peripheral visual fields, and to characterize the changes in these tissue properties with age. We find that (1) independent of age, there is higher fractional anisotropy, lower mean diffusivity, and higher mean kurtosis in the foveal and macular OR than in the peripheral OR, consistent with denser, more organized nerve fiber populations in foveal/parafoveal pathways, and (2) age is associated with increased diffusivity and decreased anisotropy and kurtosis, consistent with decreased density and tissue organization with aging. However, anisotropy in the foveal OR decreases faster with age than in the peripheral OR, while diffusivity increases faster in the peripheral OR, suggesting that foveal/peri‐foveal OR and peripheral OR differ in how they age.
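The age comparison amounts to fitting, for each OR subdivision, a slope of the tissue property against age and contrasting the slopes between foveal and peripheral sub-bundles. A minimal sketch with hypothetical variable names (the study's actual statistical modeling may differ):

```python
import numpy as np

def age_slope(age, metric):
    """Ordinary least-squares slope of a tissue property against age."""
    A = np.column_stack([age, np.ones_like(age)])
    slope, _intercept = np.linalg.lstsq(A, metric, rcond=None)[0]
    return slope

# Hypothetical per-subject arrays: mean FA within each OR subdivision.
# slope_foveal = age_slope(age, fa_foveal)      # expected: more negative
# slope_periph = age_slope(age, fa_peripheral)  # expected: less negative
```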